Exploiting Structure for Intelligent Web Search

نویسنده

  • Udo Kruschwitz
چکیده

Together with the rapidly growing amount of online data we register an immense need for intelligent search engines that access a restricted amount of data as found in intranets or other limited domains. This sort of search engines must go beyond simple keyword indexing/matching, but they also have to be easily adaptable to new domains without huge costs. This paper presents a mechanism that addresses both of these points: first of all, the internal document structure is being used to extract concepts which impose a directorylike structure on the documents similar to those found in classified directories. Furthermore, this is done in an efficient way which is largely language independent and does not make assumptions about the document structure.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Developing Intelligent Search Engines

Developers of search engines today do not only face technical problems such as designing an efficient crawler or distributing search requests among servers. Search has become a problem of identifying reliable information in an adversarial environment. Since the web is used for purposes as diverse as trade, communication, and advertisement search engines need to be able to distinguish different ...

متن کامل

Modeling and Exploiting User Search Behavior for Information Retrieval

The ongoing explosion of web information calls for more intelligent and personalized methods towards better search result quality for advanced queries. Query logs and click streams obtained from web browsers or search engines can contribute to better quality by exploiting the implicitly embedded preferences and aggregated recommendations. This work addresses approaches for incorporating implici...

متن کامل

A New Hybrid Method for Web Pages Ranking in Search Engines

There are many algorithms for optimizing the search engine results, ranking takes place according to one or more parameters such as; Backward Links, Forward Links, Content, click through rate and etc. The quality and performance of these algorithms depend on the listed parameters. The ranking is one of the most important components of the search engine that represents the degree of the vitality...

متن کامل

Exploiting Personal Search History to Improve Search Accuracy

Personal search history is an important type of personal information, from which we can learn a user’s interests and information needs, thus improve the search service for the user. In this paper, we describe our recent work on User-Centered Adaptive Information Retrieval (UCAIR), which aims at capturing personal search history with a client-side search agent and exploiting the history informat...

متن کامل

Intelligent Web Crawling

Web crawling, a process of collecting web pages in an automated manner, is the primary and ubiquitous operation used by a large number of web systems and agents starting from a simple program for website backup to a major web search engine. Due to an astronomical amount of data already published on the Web and ongoing exponential growth of web content, any party that want to take advantage of m...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001